-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 


A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) [X] Responsive to communication(s) filed on 16 September 2004.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 15 to 28 is/are pending in the application.

4a) Of the above claim(s) ____ is/are withdrawn from consideration.

5) [ ] Claim(s) ____ is/are allowed.

6) [X] Claim(s) 15 to 26 and 28 is/are rejected.

7) [X] Claim(s) 27 is/are objected to.

8) [ ] Claim(s) ____ are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 16 July 2004 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [X] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [X] All b) [ ] Some * c) [ ] None of:

1. [X] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. ____.

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.
DETAILED ACTION


Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

2. Claims 15, 16, 20 to 23, and 28 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Power et al.

Regarding independent claims 15 and 28, Power et al. discloses a speech 
recognition method and system, comprising: 

"performing a preliminary speech recognition of a voice signal to segment the 
voice signal into words and pauses and converting the words into text" - classifier 34 
receives successive feature vectors and operates on each with a plurality of models 
corresponding to different words, phonemes or phrases to generate recognition results 
(column 4, lines 54 to 57: Figure 2); classifier 34 comprises a classifying process 341 
and a state memory 342; a state field is provided for noise/silence state at the beginning 
of a word and state field for a noise/silence state at the end of a word (column 5, lines 
24 to 37: Figure 3); implicitly, recognition of particular words implies "converting the 
words into text"; 

"determining an average silence volume during the pauses" - pause detector 37 
comprises a SNR detector 372 to read an average energy level buffer 376 to determine 
a representative energy level over the frames currently identified as speech and a
representative energy level over frames currently represented as noise; the 
representative measure comprises the mean running energy level running over the 
noise segments (column 8, lines 20 to 32: Figure 10); 

"determining an average word volume for the words" - pause detector 37 
comprises a SNR detector 372 to read an average energy level buffer 376 to determine 
a representative energy level over the frames currently identified as speech and a 
representative energy level over frames currently represented as noise; the 
representative measure comprises a peak average energy level over the speech 
segment (column 8, lines 20 to 32: Figure 10); 

"calculating a difference between the average word volume and the average 
silence volume" - a signal to noise ratio value, SNR, is calculated (column 8, lines 33 to
39: Figures 10 and 12); a signal to noise ratio, SNR, represents a "difference"
because a ratio compares the relative magnitudes of two quantities; 

"evaluating a word, having a volume difference between the average word 
volume and the average silence volume is lower than a predetermined threshold, as 
having been incorrectly recognized" - rejector 36 is arranged to test the confidence of 
identification of a word by parser 35; if the identification is suspect, it is rejected; silence 
is detected by testing whether the SNR calculated by the SNR detector 373 lies below a 
very low threshold; the tests performed by the rejector include a test using the signal to 
noise ratio calculated by SNR detector 372 to reject noisy conditions and out-of-vocabulary
words (column 9, line 51 to column 10, line 43: Figures 2 and 17).
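
The claim elements mapped above can be read as a short procedure. The following Python sketch is an illustration only, under stated assumptions: it is neither the implementation of Power et al. nor that of the application. The segment representation, the function names, and the 10 dB default threshold are hypothetical.

```python
import math

def to_db(energy):
    """Convert a positive linear energy value to decibels."""
    return 10.0 * math.log10(energy)

def mean_db(energies):
    """Average volume, in dB, of a list of linear frame energies."""
    return to_db(sum(energies) / len(energies))

def evaluate_words(segments, threshold_db=10.0):
    """Flag words whose volume lies too close to the average silence volume.

    segments: list of (label, frame_energies) pairs, where label is None for
    a pause and the recognized text for a word (hypothetical format standing
    in for the output of a preliminary recognition pass).
    Returns a list of (word, difference_db, suspect) tuples.
    """
    # Average silence volume over all pause frames.
    silence = [e for label, frames in segments if label is None for e in frames]
    avg_silence_db = mean_db(silence)

    results = []
    for label, frames in segments:
        if label is None:
            continue
        # Difference between this word's average volume and the silence volume.
        diff_db = mean_db(frames) - avg_silence_db
        # A difference below the threshold marks the word as suspect.
        results.append((label, diff_db, diff_db < threshold_db))
    return results

# Hypothetical input: two pauses, one loud word, one quiet word.
segments = [(None, [1.0, 1.0]), ("yes", [100.0, 100.0]),
            (None, [1.0]), ("no", [2.0, 2.0])]
results = evaluate_words(segments)
```

In this made-up input the quiet word "no" lies only about 3 dB above the silence level and is flagged, while "yes" lies 20 dB above it and is not.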

Regarding claim 16, it is implicit that signal to noise ratios are measured in
decibels; a decibel is a logarithm of a signal energy. 
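
The relation between a ratio and a difference relied on here can be written out. With S and N denoting average speech energy and average noise energy (symbols assumed for illustration, not taken from Power et al.), a ratio expressed in decibels equals the difference of the two levels expressed in decibels:

```latex
\mathrm{SNR}_{\mathrm{dB}} = 10\log_{10}\frac{S}{N} = 10\log_{10} S - 10\log_{10} N
```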

Regarding claim 20, Power et al. discloses thresholds, but does not say that the
thresholds are adjusted or adapted; thus, the thresholds are constants. 

Regarding claim 21, Power et al. discloses that a recognition rejector 36 is arranged
to reject recognition of a word recognized by parser 35 if recognition is unreliable 
(column 4, lines 61 to 66: Figure 2). 

Regarding claims 22 and 23, Power et al. discloses that rejector 36 issues a "query"
signal which enables the utilizing apparatus 4 to initiate a confirmatory dialogue by
synthesizing a phrase asking the user to repeat the word (column 9, lines 52 to 61); a
confirmatory dialogue is "a message". 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claim 24 is rejected under 35 U.S.C. 103(a) as being unpatentable over Power et 
al. in view of Polikaitis et al.

Power et al. discloses a recognition rejector 36 arranged to reject recognition of a
word recognized by parser 35 if recognition is unreliable, wherein a rejection signal is 
output from the rejector 36 to control signal output 38 for use in controlling a utilizing 
apparatus 4 (column 4, line 61 to column 5, line 2: Figures 1 and 2). However, Power et
al. does not specifically provide a message output for a user to speak louder so that an 
adequate distance is achieved between the average word volume and the average 
silence volume. Polikaitis et al. teaches a speech recognition device for screening
speech input, wherein error procedures are performed if ratios of speech energy are 
less than various thresholds. Specifically, Polikaitis et al. discloses if Control4 is option 
A, the user is prompted in step 270 to repeat the voice instruction and is prompted to 
speak louder (column 9, lines 5 to 8: Figure 2). Implicitly, speaking louder causes 
SpeechEnergy ("average word volume") to increase relative to NoiseEnergy ("average 
silence volume") as an increased signal-to-noise ratio ("so that an adequate distance is 
achieved"). Polikaitis et al. teaches an objective of screening speech input so that a 
speech recognizer operates correctly because speech recognition technology does not 
work well when the user speaks too softly. (Column 1, Line 39 to Column 2, Line 12) It 
would have been obvious to one having ordinary skill in the art to provide a message to 
a user to speak louder as taught by Polikaitis et al. in the speech recognition system 
with sequence parsing and rejection of Power et al. for the purpose of screening speech 
input so that a recognizer operates correctly when the user speaks too softly. 

5. Claims 17 to 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Power et al. in view of Wu et al. 

Regarding claim 17, Power et al. does not disclose adapting thresholds on the 
basis of the global difference, although adaptive thresholds are fairly well known. Wu et 
al. teaches a generally similar speech recognition method for analyzing endpoints in
speech with signal-to-noise ratios, where speech recognition is only performed if a 
predetermined restart threshold level is identified. (Column 9, Line 56 to Column 10, 
Line 5) Wu et al. employs adaptive thresholds, T_s, T_e, T_sr, T_er, defined in terms of an
average background noise level, N_bg, and average speech energy levels, E_ls and E_le.
(Column 7, Line 25 to Column 9, Line 31: Figures 8, 9(a) and 9(b)) Specifically, Wu et 
al. says the method is advantageous for eliminating errors due to mistaking breathing 
for actual speech. (Column 9, Line 56 to Column 10, Line 5) It would have been 
obvious to one having ordinary skill in the art to employ adaptive thresholds defined in 
terms of average speech energy and average noise energy as suggested by Wu et al. for
the thresholds of Power et al. in order to eliminate errors due to mistaking breathing for 
actual speech. 

Regarding claim 18, Wu et al. discloses the thresholds are related to the signal-to-noise
ratios, defined in terms of differences E_ls - N_bg and E_le - N_bg (column 8, lines 24
to 65).

Regarding claim 19, Wu et al. discloses general formulae for adaptive thresholds 
T_sr and T_er, where the thresholds are diminished by a factor -C3 N_bg, and C3 is a constant
to account for conditions of unstable background noise (column 9, lines 20 to 31). 
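
A rough sketch of how a threshold of this kind might adapt to measured levels follows. This Office action does not reproduce the formulae of Wu et al., so the expression below, its coefficients c1 and c3, and the function name are assumptions. They are chosen only to show a threshold that rises with the difference between speech and noise levels and is lowered by a term proportional to the background noise level.

```python
def adaptive_threshold(speech_level, noise_level, c1=0.5, c3=0.0):
    """Hypothetical adaptive threshold; not the formula of Wu et al.

    speech_level and noise_level are average levels in the same units.
    The threshold sits a fraction c1 of the way from the noise level up to
    the speech level, then is lowered by c3 * noise_level as an allowance
    for unstable background noise.
    """
    return noise_level + c1 * (speech_level - noise_level) - c3 * noise_level

# Hypothetical levels: speech at 60, background noise at 20.
stable = adaptive_threshold(60.0, 20.0)            # no allowance for unstable noise
unstable = adaptive_threshold(60.0, 20.0, c3=0.1)  # lowered by 0.1 * 20
```

Because both levels are measured rather than fixed, the threshold moves when the background noise or the speaker's volume changes, which is the sense in which it is adaptive.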

6. Claims 25 and 26 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Power et al. in view of Polikaitis et al. as applied to claim 24 above, and further in 
view of Wu et al. 

Regarding claim 25, Power et al. discloses thresholds, but does not determine an
average silence volume for an individual pause. However, Wu et al. teaches a 
generally similar speech recognition method for analyzing endpoints in speech with 
signal-to-noise ratios, where speech recognition is only performed if a predetermined 
restart threshold level is identified. (Column 9, Line 56 to Column 10, Line 5) Wu et al. 
determines an average background noise level on the basis of segments of silence 
energy defining a reliable island. (Column 7, Lines 25 to 42: Figure 8) Similarly, Wu et 
al. determines average speech energy levels, E_ls and E_le, on the basis of segments of
speech energy defining a reliable island. (Column 7, Line 58 to Column 8, Line 23: 
Figures 9(a) and 9(b)). Wu et al. says the method is advantageous for eliminating 
errors due to mistaking breathing for actual speech. (Column 9, Line 56 to Column 10, 
Line 5) It would have been obvious to one having ordinary skill in the art to determine a 
difference between average speech energy and average noise energy in terms of 
immediately preceding or immediately following pauses as suggested by Wu et al. 
instead of the average noise energy and peak average speech energy of Power et al. 
for the purpose of eliminating errors due to mistaking breathing for actual speech. 

Regarding claim 26, Power et al. discloses thresholds, but does not expressly 
average silence volume over a plurality of successive pauses to determine a difference 
between average word volume and average silence volume. However, Wu et al. 
determines an average background noise level, N_bg, on the basis of segments of silence
energy defining a reliable island, and similarly, determines average speech energy
levels, E_ls and E_le, on the basis of segments of speech energy defining a reliable island.
(Column 7, Line 25 to Column 8, Line 23: Figures 8, 9(a), and 9(b)). Wu et al. says the
method is advantageous for eliminating errors due to mistaking breathing for actual 
speech. (Column 9, Line 56 to Column 10, Line 5) It would have been obvious to one 
having ordinary skill in the art to combine the segmental energy averaging method of 
Wu et al. for the average noise energy and peak average speech energy of Power et al.
so as to determine the global average silence energy on the basis of a sum of the 
energies of successive silence segments for the purpose of eliminating errors due to 
mistaking breathing for actual speech. 
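
The distinction drawn for claims 25 and 26, between a silence level taken from the pauses adjacent to one word and a global level taken over successive pauses, can be sketched as follows. This is an illustration only. The data layout and function names are hypothetical and are drawn from neither Wu et al. nor Power et al.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def local_silence_level(pauses, word_index):
    """Average energy of the pauses immediately before and after one word.

    pauses[i] holds the frame energies of the pause preceding word i, and
    pauses[i + 1] those of the pause following it (assumed layout).
    """
    return mean(pauses[word_index] + pauses[word_index + 1])

def global_silence_level(pauses):
    """Average energy over all frames of all successive pauses."""
    total = sum(sum(p) for p in pauses)   # sum of the energies of every pause
    count = sum(len(p) for p in pauses)   # total number of pause frames
    return total / count

# Hypothetical frame energies for three pauses surrounding two words.
pauses = [[1.0, 1.0], [3.0], [5.0, 5.0, 5.0]]
local_level = local_silence_level(pauses, 0)
global_level = global_silence_level(pauses)
```

With noise that grows over the utterance, as in this made-up input, the local level for the first word is lower than the global level, so the two approaches can lead to different differences for the same word.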

Allowable Subject Matter 

7. Claim 27 is objected to as being dependent upon a rejected base claim, but 
would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

Response to Arguments 

8. Applicants' arguments filed 16 September 2004 have been considered but are 
moot in view of the new grounds of rejection. 

Conclusion 

9. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Shaffer et al. discloses related art. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Martin Lerner whose telephone number is (703) 308-9064.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (703) 305-9645. The fax phone
number for the organization where this application or proceeding is assigned is
703-872-9306.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 